1.  INTRODUCTION


Material  properties  of  kinetic  energy  penetrators  are  compared  at  the  Ballistic 
Research  Laboratory  in  a  1/4-scale  test  environment.  Metallurgists  fire  penetrators  of 
various  material  compositions  into  semi-infinite  steel  blocks  and  record  depths  of 
penetration.  Depth  of  penetration  behaves  approximately  as  a  linear  function  of 
velocity,  d(v),  over  the  range  of  the  four-velocity  design  routinely  employed.  Under  a 
common slopes assumption, a difference in performance between penetrators k and l is
computed as $d_k(v) - d_l(v)$. This difference is determined graphically, occasionally with
the  benefit  of  a  least-squares  fit  to  each  performance.  Statements  of  significance  are 
not  made  at  present.  In  this  paper,  a  randomization  test  is  presented  as  a  means  for 
providing  analytical  support  for  inference. 

Inferences  drawn  from  such  experimentation  may  be  considered  the  result  of 
meta-analysis.  Meta-analysis  is  loosely  described  as  the  "integration  of  independent 
studies"  by  Hedges  and  Olkin  (1985).  This  area  has  received  much  recent  attention  in 
the  social  and  biological  sciences,  but  in  the  physical  and  engineering  sciences  it  has 
received little notice with the exception of a few historical papers (e.g., Tippet [1931]
and  Fisher  [1932])  that  have  been  classified  in  retrospect  as  meta-analyses.  The 
independent-studies  quality  of  the  aforementioned  problem  stems  from  the 
combination  of  data  sets  gathered  at  different  times  (often  different  years)  and  by 
different  experimenters.  This  fact,  practically  speaking,  invalidates  a  necessary 
assumption  for  normal  theory  analyses,  namely  the  belief  that  the  subjects  for  the 
combined  data  set  are  the  result  of  a  random  sample.  Taylor  and  Bodt  (1991) 
recommend  surmounting  this  problem  through  the  use  of  randomization  tests  and 
demonstrate  applicability  of  this  methodology  to  significance  testing  with  ballistic  data. 

The  purpose  of  this  paper  is  to  introduce  a  randomization  test  for  comparing  1/4- 
scale  kinetic  energy  penetrators.  A  description  of  the  data  collection  is  followed  by  the 
discussion of a linear model through which significance testing of relevant contrasts can
be  made.  It  is  then  demonstrated  how  a  reference  distribution  for  determining 
significance  can  be  achieved  through  randomization.  Application  of  the  procedure  and 
discussion  of  the  results  follow. 


*  In  an  ideal  situation  one  would  design  a  multiyear  experiment  where  random  sampling  did  occur,  but  the  obstacles  are  so 
formidable  in  this  testing  environment  that  it  is  not  done. 
2.  THE  DATA  COLLECTION


The measured response, $d_{ij}$, is the depth of penetration permitted by a semi-infinite
steel block subjected to a hemi-nose penetrator of material j, fired at velocity i.
Semi-infinite  describes  the  independence  of  the  penetration  action  to  influences  from 
side  and  rear  free  surfaces  (i.e.,  the  block  is  for  practical  purposes  infinite  with  respect 
to  width  and  depth).  Hemi-nose  refers  to  the  hemispherical  configuration  of  the 
projectile  nose.  Figure  1  shows  the  cut-away  profile  of  a  semi-infinite  block,  where  the 
cut  is  made  along  the  shot  line.  Depth  of  penetration  is  taken  to  be  the  maximum 
normal  distance  between  the  original  entry-point  surface  and  the  bottom  surface  of  the 
hole. 


Depths of penetration from penetrators of several different material compositions
are  gathered  over  several  velocities.  The  design  structure  suggests  that  the 
experimental  units  are  the  semi-infinite  steel  blocks.  It  is  these  that  are  exposed  to  the 
two  treatments,  velocity  and  penetrator  material.  Velocity  is  included  as  a  test 
condition  because  it  will  affect  penetration  depth.  Penetrator  material  is  the  only 
treatment of interest: materials are to be compared for relative effectiveness.
Confidence  in  the  assessment  of  relative  performance  is  ensured  through  comparison 
over  a  range  of  velocities  meaningful  to  the  Army  application  (i.e.,  over  a  typical 
ordnance  velocity  range).  A  template  for  the  experiment  is  to  fire  each  penetrator 
(material)  once  at  each  of  the  following  four  nominal  velocities:  1100  m/s,  1300  m/s, 
1500  m/s,  and  1700  m/s.  Actual  velocities  will  vary.  A  design  matrix  overlaid  on  a 
combined  data  set  including  different  materials  might  appear  as  Figure  2. 

Other  facets  of  data  collection  influence  the  analysis.  Penetrators  are  tested  in 
separate  experiments,  quite  possibly  over  as  many  as  ten  years  if  the  purpose  is  to 
compare  new  materials  to  an  historical  control.  Small  sample  sizes  with  no  replication 
prevail  if  one  adheres  to  the  template  for  testing  materials.  There  is  no  random 
sampling from a population of semi-infinite blocks; indeed, at the time of the first
experiment, blocks used in later firings may not yet have been manufactured. Even if
the  sample  were  random,  there  is  no  guarantee  that  the  population  is  normal,  nor  is  it 
likely  that  the  comfort  of  approximate  normality  can  be  afforded  by  the  Central  Limit 
Theorem  with  the  sample  sizes  and  replication  considered. 
Figure 1.  Cut-Away Profile of a Semi-Infinite Steel Block After Penetration. (Labeled dimension: extent of penetration.)

Figure 2.  Template for Data Collection. (Horizontal axis: velocity in m/s, at the nominal values 1100, 1300, 1500, and 1700.)


3.  THE  LINEAR  MODEL


A  linear  models  framework  is  presented  in  this  section  to  support  inference  for 
this  problem.  Great  detail  is  not  given.  For  a  comprehensive,  but  introductory, 
treatment,  it  is  suggested  the  reader  turn  to  Neter  and  Wasserman  (1974).  The 
problem  is  first  described  in  the  context  of  a  two-factor  factorial  design,  followed  by  a 
refinement  in  the  form  of  an  analysis  of  covariance  model.  A  convenient  regression 
form  of  this  model  is  then  used  to  construct  meaningful  contrasts.  Assumptions 
required  for  traditional  significance  testing  of  those  contrasts  are  also  discussed. 

3.1  Factorial  Design.  The  design  matrix  shown  in  Figure  2  and  the  problem 
description  suggest  that  a  factorial  design  may  be  appropriate,  with  penetrator  material 
serving  as  the  principal  treatment  under  study  and  velocity  serving  as  an  additional 
design  variable.  The  additive  model  is  expressed 

$$d_{ij} = \mu + V_i + M_j + e_{ij}, \tag{1}$$

where $\mu$ is the common mean response, $V_i$ and $M_j$ are the effects (shifts from that
mean) caused by the ith velocity and the jth material, respectively, and $e_{ij}$ is the error
associated  with  the  (ij)th  response.  We  begin  by  assuming  a  Model-I  stance,  indicating 
that  both  material  and  velocity  be  treated  as  fixed  effects. 

Two  facts  render  this  approach  less  than  ideal.  The  first,  stated  in  the 
Introduction,  is  that  experimenters  know  that  velocity  behaves  approximately  linearly 
with penetration depth. Even further, experience has shown that $d_k(v)$ and $d_l(v)$ are
virtually parallel over the 1100 m/s to 1700 m/s velocity regime, hence the additivity
assumption above. Beyond this regime the assumptions of linearity and parallel lines do
not hold.* The second is that although four nominal velocities are intended, the actual
velocities  tested  often  number  as  many  as  the  number  of  1/4-scale  rods  fired.  Because 
firing  velocity  cannot  be  completely  controlled,  each  nominal  velocity  actually 
encompasses  a  range  of  velocities  close  to  the  nominal.  Figure  3  illustrates  both 
linearity  and  firing  velocity  noise  in  replication  of  tungsten  alloy  firings  at  the  four 
nominal  velocities. 


*  Current engineering thought, supported by high velocity testing results, is that the lines begin slowly converging over this velocity
regime,  with  more  rapid  convergence  occurring  well  beyond  1700  m/s. 
This additional information impacts the method of analysis. Taking advantage of
linearity would save the experimenter degrees of freedom to apply in the estimation of
error; more efficiency in the model is possible. Left unconsidered, firing velocity noise
would  increase  the  estimate  of  response  variability.  In  the  next  section  the  analysis  of 
covariance  model  is  suggested,  having  the  advantage  that  both  linearity  and  firing 
velocity  variation  can  be  incorporated. 

3.2  Analysis  of  Covariance. 

3.2.1  Traditional  Model.  The  linear  relationship  between  velocity  and  depth  of 
penetration  can  be  made  part  of  the  linear  model  as  follows.  Rewrite  Equation  1  as 

$$d_{ij} = \mu + (\mu_{i.} - \mu) + (\mu_{.j} - \mu) + (d_{ij} - \mu_{i.} - \mu_{.j} + \mu), \tag{2}$$

where $\mu$ is again the common mean response and $\mu_{i.}$ and $\mu_{.j}$ are the true mean depths of
penetration associated with the ith velocity and jth material, respectively. Replace the
mean-shift interpretation for the velocity effect, $\mu_{i.} - \mu$, with $\mu_{d/v}$ to represent the simple
linear relationship between velocity and the mean response. Adding and subtracting
$\mu_{d/v}$ in the right side of Equation 2 and rearranging terms leaves

$$d_{ij} = \mu_{d/v} + (\mu_{.j} - \mu) + (d_{ij} - \mu_{d/v} - \mu_{.j} + \mu). \tag{3}$$

Let $v_{ij}$ represent the velocity of the (ij)th penetrator, where the index i need not reflect
the nominal velocities in Figure 2. The simple linear model which regresses penetration
depth on velocity can then be expressed as $\mu + \gamma(v_{ij} - \bar{v})$, where $\gamma$ is the slope of the
regression and $\bar{v}$ is the sample mean velocity based on velocity observations taken over
both indices. Substituting this for $\mu_{d/v}$ in Equation 3 yields

$$d_{ij} = \mu + \gamma(v_{ij} - \bar{v}) + (\mu_{.j} - \mu) + (d_{ij} - \gamma(v_{ij} - \bar{v}) - \mu_{.j}), \tag{4}$$

which  the  reader  recognizes  as  the  common  form  of  an  analysis  of  covariance  model. 

Certainly,  the  analysis  of  covariance  model  in  Equation  4  has  appeal  in  that  it  can 
account  for  the  contribution  to  penetration  depth  from  individual  velocities;  whereas,  in 
the factorial design the contribution of nominal velocities is counted as being the same
regardless  of  noise.  Further,  even  if  the  nominal  velocities  were  exactly  achieved,  there 
is  advantage  to  be  gained  in  introducing  the  linearity  information  in  the  model.  In  that 
case,  degrees  of  freedom  for  estimating  error  are  saved.  The  factorial  design  allows  s-2 
fewer  degrees  of  freedom  for  error,  where  s  represents  the  number  of  nominal 
velocities. This follows directly from the fact that the factorial design requires s - 1
degrees  of  freedom  be  assigned  to  velocity;  whereas,  the  simple  linear  regression  needs 
only  one  degree  of  freedom  assigned  to  the  slope  to  account  for  the  influence  of 
velocity.  If  the  regression  is  perfect  (i.e.,  fits  exactly  to  the  mean  response  for  each 
nominal  velocity),  the  sum  of  squares  associated  with  error  for  both  models  is  identical, 
leaving  analysis  of  covariance  with  a  decided  advantage.  If  the  regression  is  not  perfect, 
a  tradeoff  is  made  wherein  degrees  of  freedom  for  the  error  term  denominator  are 
gained  at  the  expense  of  the  regression  lack-of-fit  being  added  in  the  numerator.  In 
consideration  of  data  with  a  strong  linear  relationship  like  those  displayed  in  Figure  3, 
an  analysis  of  covariance  approach  would  be  a  more  appropriate  choice  than  the  two- 
factor  factorial. 
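
The degrees-of-freedom comparison can be made explicit. As a sketch, let N denote the total number of observations (a symbol introduced here for illustration; it is not used in the original), with s nominal velocities, t materials, and no interaction term in either model:

$$\mathrm{df}_{\mathrm{error}}^{\mathrm{factorial}} = (N - 1) - (s - 1) - (t - 1), \qquad \mathrm{df}_{\mathrm{error}}^{\mathrm{ANCOVA}} = (N - 1) - 1 - (t - 1),$$

$$\mathrm{df}_{\mathrm{error}}^{\mathrm{ANCOVA}} - \mathrm{df}_{\mathrm{error}}^{\mathrm{factorial}} = (s - 1) - 1 = s - 2.$$

For the four-velocity template (s = 4), the analysis of covariance would therefore gain two degrees of freedom for error.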

Using  the  analysis  of  covariance  model  to  describe  the  problem  structure, 
questions  regarding  material  comparisons  can  be  answered  through  the  study  of 
contrasts. If the experimenter is interested in the difference in the effect of any two
materials k and l, the contrast $M_k - M_l$ (i.e., $\mu_{.k} - \mu_{.l}$) would be estimated and then tested
for significance.

3.2.2  Regression  Formulation.  It  is  convenient  to  reformulate  Equation  4  in  terms 
of  a  regression  model.  From  an  applications  perspective,  the  least-squares  approach  is 
more  widely  understood  and  accepted  by  practitioners.  Moreover,  the  parameters 
have  greater  intuitive  appeal,  and  their  meaning  conforms  to  how  experimenters  at  the 
Ballistic  Research  Laboratory  currently  think  of  the  problem. 

The change is accomplished easily. Replace the material effect, having t distinct
levels, with indicator variables $m_{ijk}$, $k = 1, 2, \ldots, t-1$, defined such that

$$m_{ijk} = \begin{cases} 1 & \text{if the observation is of material } k; \\ 0 & \text{otherwise.} \end{cases}$$

The columns in the regression design matrix corresponding to the indicator variables
will be mutually orthogonal. Thus, Equation 4 may be expressed in terms of a
regression model as

$$d_{ij} = \beta_0 + \beta_1 m_{ij1} + \cdots + \beta_{t-1} m_{ij,t-1} + \beta_t (v_{ij} - \bar{v}) + e_{ij}, \tag{5}$$

where

$$\beta_0 = \mu + M_t, \qquad \beta_k = M_k - M_t, \quad k = 1, 2, \ldots, t-1, \qquad \beta_t = \gamma.$$

*  Regression is also of use, computationally, when the design matrix is unbalanced.

The coefficients $\beta_k$, $k = 1, 2, \ldots, t-1$, represent the difference between the effect of the
kth and tth material (i.e., the vertical difference between the parallel regression lines,
$d_k(v) - d_t(v)$). The designation of the tth material is arbitrary, determined by how the
indicator variables are defined. In the design matrix for the regression model, the rows
corresponding to the tth material would have zeros in the columns corresponding to the
t-1 indicator variables. The interpretation of the $\beta_k$s would be most natural if a
reference group or an historical control were denoted the tth material. Other
comparisons may also be of interest. The general contrast $M_k - M_l$, $k, l \neq t$, is obtained
through the difference $\beta_k - \beta_l$.
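
The indicator coding can be illustrated with a short sketch. It builds the design-matrix rows of Equation 5 from the material labels and velocities listed in Table 1, with 93% tungsten as the tth (reference) material. The function and variable names are illustrative assumptions, not part of the original analysis, and no regression is fitted here.

```python
# Material labels and velocities (m/s) as listed in Table 1 of this report.
OBSERVATIONS = [
    ("97% tungsten", 1098), ("97% tungsten", 1304),
    ("97% tungsten", 1489), ("97% tungsten", 1507),
    ("DU Rc=49", 1067), ("DU Rc=49", 1314),
    ("DU Rc=49", 1481), ("DU Rc=49", 1654),
    ("DU Rc=45", 1304), ("DU Rc=45", 1482), ("DU Rc=45", 1660),
    ("93% tungsten", 1086), ("93% tungsten", 1297),
    ("93% tungsten", 1500), ("93% tungsten", 1682),
]

# Materials 1..t-1 receive an indicator column; the reference (tth)
# material, 93% tungsten, is coded as all zeros.
INDICATOR_ORDER = ["97% tungsten", "DU Rc=49", "DU Rc=45"]


def design_matrix(observations, indicator_order):
    """Return rows [1, m_1, ..., m_{t-1}, v - v_bar] and the mean velocity."""
    v_bar = sum(v for _, v in observations) / len(observations)
    rows = []
    for material, v in observations:
        indicators = [1.0 if material == name else 0.0
                      for name in indicator_order]
        rows.append([1.0] + indicators + [v - v_bar])
    return rows, v_bar


def dot(rows, a, b):
    """Inner product of columns a and b of the design matrix."""
    return sum(r[a] * r[b] for r in rows)


rows, v_bar = design_matrix(OBSERVATIONS, INDICATOR_ORDER)
print("mean velocity:", v_bar)
print("indicator column inner products:",
      dot(rows, 1, 2), dot(rows, 1, 3), dot(rows, 2, 3))
```

Because each observation belongs to exactly one material, no row has more than one nonzero indicator, which is why the indicator columns are mutually orthogonal. Any least-squares routine could then be applied to these rows together with the penetration depths.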

In  this  section  the  treatment  effects  were  expressed  in  the  context  of  a  regression 
formulation  of  the  analysis  of  covariance  model.  Estimation  of  these  effects  can  be 
accomplished  after  first  determining  the  least  squares  estimate  of  the  coefficient  vector. 
The next step, and the main focus of this effort, is to determine the significance of
these  effects.  To  begin,  we  consider  conditions  for  test  validity. 

3.2.3  Assumptions.  Several assumptions are required to support the usual analysis
of covariance for this problem. They appear as follows: 1) the regression slopes are
nonzero and homogeneous among materials, 2) velocity is unaffected by material,
3) velocity is precisely measured, 4) model errors are distributed with zero mean and
common variance, and 5) the responses are considered jointly independent normal
random variables. The practical implication of 4) and 5) together is that penetration
depths to be allowed by the semi-infinite blocks constitute a random sample from some
conceptual normal population.

The  first  four  assumptions  are  accepted;  the  last  is  not.  Velocity  obviously  affects 
penetration  depth,  and  data  support  the  similar-slopes  claim.  All  test  penetrators  are 
identical  in  geometry;  there  is  no  reason  to  expect  that  velocity  will  be  influenced  by 
which material composition is being tested. Velocity, though not completely controlled,
is  precisely  measured  using  an  x-ray  multiflash  system.  The  fourth  assumption  is 
common  to  nearly  all  modeling  efforts.  Replicate  data  provide  a  basis  for  investigating 
the  common  variance  claim,  but  error  with  zero  mean  must  remain  without  check.  As 
for  the  last  assumption,  there  is  no  reason  to  expect  that  penetration  depths  are  normal, 
and  because  of  the  individual-study  nature  of  the  experiments,  they  do  not  constitute  a 
random  sample. 

In  Section  4  we  relax  this  last  assumption  to  require  only  that  the  penetration 
depths  be  pairwise  uncorrelated.  With  that  change,  the  least-squares  estimation  of  the 
parameters  in  Equation  5  will  retain  the  usual  properties  of  uniform  minimum  variance 
among  linear  unbiased  estimators  but  without  any  known  distribution  on  which  to  base 
tests  of  significance.  Under  these  revised  model  assumptions,  an  alternative  test  for 
significance  is  given. 

4.  A  RANDOMIZATION  TEST 

In  this  section  a  randomization  test  is  proposed  as  a  means  to  discern  among 
statistically  different  materials.  Its  principal  advantages  are  freedom  from  the 
assumption  that  data  under  consideration  constitute  a  random  sample  from  a  normal 
population  and  the  ability  to  provide  exact  significance  levels.  Some  basic  foundation  is 
followed  by  a  description  of  the  test. 

4.1  Foundation.  A  randomization  test  is  a  method  through  which  significance 
testing  is  accomplished,  with  the  sampling  distribution  of  the  test  statistic  derived  from 
permutations  (combinations)  of  the  data.  A  test  of  significance  measures  the  numerical 
evidence  against  a  conjecture.  Data,  conveyed  through  a  suitable  test  statistic,  are 
examined  as  to  their  consistency  with  the  conjecture  by  comparing  the  observed  value 
of the test statistic to its sampling distribution, formed assuming the conjecture is true.
Degrees  of  inconsistency  are  reflected  in  how  unusual  the  observed  test  statistic 
appears.  This  appearance  is  measured  in  terms  of  the  p-value,  the  probability  that  a 
value  of  the  test  statistic  is  at  least  as  unusual  (large  or  small)  as  the  one  observed. 

A classical analysis in this 1/4-scale penetrator environment, based on the model
of Section 3.2.2, suggests that a conjecture (null hypothesis) of either $H_0\colon \beta_k = 0$ or
$H_0\colon \beta_k - \beta_l = 0$ might be tested to compare two materials. Consider the latter hypothesis,
a claimed equivalence between materials k and l. Letting b denote a least-squares
estimate for $\beta$, $b_k - b_l$ is the estimated difference between materials k and l (i.e., the
estimated vertical distance between their parallel regression lines). To determine
whether the distance is statistically significant, one need only compare $b_k - b_l$ to its
sampling distribution. This distribution is readily attainable, but only if one is willing to
assume a normal random sample, a condition not satisfied here.

Useful  significance  tests  are  possible  without  benefit  of  assumption  5).  In  what 
follows,  this  assumption  is  replaced  with  the  less  restrictive  condition  that  penetration 
depths  be  pairwise  uncorrelated,  thus  guaranteeing  desirable  properties  for  the  least- 
squares  estimators.  Before  proceeding  we  should  note  that  others  have  circumvented 
the  normality  requirement.  Nonparametric  approaches  to  this  problem  include  papers 
by  Quade  (1967),  Puri  and  Sen  (1969),  Shirley  (1981),  Conover  and  Iman  (1982),  and 
Stephenson  and  Jacobson  (1988).  All  focus  on  the  rank  transforms  of  either  the 
response  variable,  the  concomitant  variables,  or  both.  For  example,  Conover  and  Iman 
(1982)  transform  both  sets  of  variables  to  ranks  and  then  conduct  a  parametric  analysis 
of  covariance,  eventually  relying  on  the  F-distribution  to  determine  significance.  An 
exception  to  complete  reliance  on  ranks  is  found  in  Puri  and  Sen  (1969).  In  that  paper 
general  scores,  including  ranks,  are  adjusted  for  regression  on  the  concomitant 
variables,  and  the  asymptotic  distribution  of  the  test  statistic  based  on  those  scores  is 
developed  using  permutation.  The  hypothesis  tested  is  that  no  difference  exists  overall 
among  the  treatments  (materials)  studied.  A  related  approach  is  now  described, 
focusing  on  the  pairwise  comparison  of  materials. 

4.2  Description.  Consider first $H_0\colon \beta_k = 0$. The geometrical interpretation of $\beta_k$
is that it is the vertical distance between the parallel regression lines $d_k(v)$ and $d_t(v)$.
This fact is evident from Equation 5. The linear effect of velocity can be removed by
adjusting the penetration depth values for the velocities used to achieve them; the
remaining difference among the adjusted values, excluding random variation, is
attributable to material and is expressed as $\beta_k$. This difference is estimated as $b_k$ by
subtracting the average of the residuals resulting from material t from that of material
k, the residuals being computed relative to $d_t(v)$ in each case. Thus, once the two
groups of residuals are formed, we are interested in the difference in location between
them.


*  Specifically,  the  null  hypothesis  for  the  randomization  test  is  that  penetration  depth  measurements  are  stochastically 
independent  of  the  penetrator  having  been  formed  from  material  k  or  material  t  (Edgington  1987). 
To determine if this difference is significant, we need only establish a reference
distribution and compare the observed difference to it. Under the null hypothesis, $d_k(v)$
and $d_t(v)$ are coincident. Thus, the residuals computed after adjusting for the linear
effect of velocity should be homogeneous. Therefore, in computing $b_k$, the distinction
of which residuals resulted from assignment (association) with material k or material t
should make little difference. The reference distribution is constructed by computing $b_k$
under all possible assignments of residuals (effectively ignoring material distinction) to
the two materials,* the cardinality of each material set being preserved. For example, if
material k had five data values and material t had four, there would be (5+4)C5 values
computed for $b_k$. The p-value for the two-sided alternative hypothesis is simply the
ratio of the number of values in the reference distribution which equal or exceed in
absolute value the observed $|b_k|$ to the total number of combinations, (5+4)C5.
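
The counting rule can be restated compactly. In this sketch the symbols $n_k$, $n_t$, $b_k^{(c)}$, and $b_k^{\mathrm{obs}}$ are introduced for illustration only and do not appear in the original: $n_k$ and $n_t$ are the numbers of residuals for materials k and t, $b_k^{(c)}$ is the statistic under assignment c, and $b_k^{\mathrm{obs}}$ is its observed value.

$$p = \frac{\#\{\, c : |b_k^{(c)}| \ge |b_k^{\mathrm{obs}}| \,\}}{\binom{n_k + n_t}{n_k}}, \qquad \binom{5+4}{5} = \frac{9!}{5!\,4!} = 126.$$

Because the observed assignment is itself one of the 126, the smallest p-value attainable with five and four data values is 1/126, or about 0.008.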

Significance testing for the hypothesis $H_0\colon \beta_k - \beta_l = 0$ is achieved similarly. Adjust
penetration values for the linear effect of velocity and compute residuals in the same
manner, still computing the residuals relative to $d_t(v)$. The difference between
materials is estimated by $b_k - b_l$ and computed by subtracting the average of the
residuals resulting from material l from that of material k. The reference distribution
arises from computing $b_k - b_l$ under all possible assignments of residuals between
materials k and l.

*  This rationale presupposes random allocation of subjects to treatments. However, as pointed out by Edgington (1987), random
allocation principally guards against undue influence resulting from between or within subject variability. Such variability in the
context of semi-infinite steel blocks is considered negligible relative to the material differences under study.

Before turning to examples, some more detail is required as to how these
residuals, relative to $d_t(v)$, are computed. From Equation 5, the model $d_t(v)$ can be
expressed

$$d_t(v) = \beta_0 + \beta_t (v - \bar{v}). \tag{6}$$

(The indices ij have been suppressed to emphasize that this is a model for penetration
depth.) Both $\beta_0$ and $\beta_t$ must be estimated. Begin with slope. Assuming parallel
penetration-against-velocity models d(v), the common slope is taken as the average
within-materials regression slope, $b_t$, which can be delivered by any regression
subroutine fitting the regression expressed as Equation 5 in its complete form. The
estimate is computed as

$$b_t = \frac{\sum_j \sum_i (v_{ij} - \bar{v}_{.j})(d_{ij} - \bar{d}_{.j})}{\sum_j \sum_i (v_{ij} - \bar{v}_{.j})^2},$$

where $\bar{v}_{.j}$ and $\bar{d}_{.j}$ are the mean velocity and mean penetration depth for the jth material.

Usually, $\bar{d}_t$ would serve as the estimate for $\beta_0$ in Equation 6. However, in an analysis of
covariance $\bar{d}_t$ must be adjusted (adj.) for the common slope, leaving

$$\bar{d}_{t(adj.)} = \bar{d}_t - b_t(\bar{v}_t - \bar{v}).$$

This too will be delivered by a regression of Equation 5 when zeros are used as the
values for the t-1 indicator variables in the data rows corresponding to the tth material.
Estimating $\beta_0$ and $\beta_t$ by $\bar{d}_{t(adj.)}$ and $b_t$, respectively, the model $d_t(v)$ shown in Equation 6
can be estimated by

$$\hat{d}_{ij(t)} = \bar{d}_{t(adj.)} + b_t (v_{ij} - \bar{v}). \tag{7}$$

Equation 7 is merely the least-squares fit for the tth material, taking into consideration
the common slope over all materials. The residuals for the jth material relative to the
tth material, $r_{ij(t)}$, are computed as

$$r_{ij(t)} = d_{ij} - \hat{d}_{ij(t)}.$$

The residuals $r_{ij(t)}$ are then manipulated in the manner described above.
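
The three computations above (common slope, adjusted mean, residuals relative to the tth material) can be sketched as follows. This is a minimal illustration under stated assumptions: the data set `TOY` is hypothetical, constructed so that both materials lie exactly on parallel lines with slope 0.1, and the function names are not from the original.

```python
def mean(values):
    return sum(values) / len(values)


def common_slope(groups):
    """Pooled within-materials slope b_t: sums of cross-products and
    squares are taken about each material's own means."""
    numerator = 0.0
    denominator = 0.0
    for shots in groups.values():
        v_bar_j = mean([v for v, _ in shots])
        d_bar_j = mean([d for _, d in shots])
        numerator += sum((v - v_bar_j) * (d - d_bar_j) for v, d in shots)
        denominator += sum((v - v_bar_j) ** 2 for v, _ in shots)
    return numerator / denominator


def residuals_relative_to(groups, reference):
    """Residuals r_ij(t) of every material about the fitted line of the
    reference material (adjusted mean plus common slope, Equation 7)."""
    b_t = common_slope(groups)
    v_bar = mean([v for shots in groups.values() for v, _ in shots])
    ref = groups[reference]
    d_adj = mean([d for _, d in ref]) - b_t * (mean([v for v, _ in ref]) - v_bar)
    return {
        name: [d - (d_adj + b_t * (v - v_bar)) for v, d in shots]
        for name, shots in groups.items()
    }


# Hypothetical (velocity m/s, depth mm) pairs: A follows 10 + 0.1 v and
# B follows 25 + 0.1 v, so A sits 15 mm below B at every velocity.
TOY = {
    "A": [(1100, 120.0), (1300, 140.0), (1500, 160.0), (1700, 180.0)],
    "B": [(1100, 135.0), (1300, 155.0), (1500, 175.0)],
}

slope = common_slope(TOY)
residuals = residuals_relative_to(TOY, reference="B")
print("common slope:", slope)
print("mean residual of A relative to B:", mean(residuals["A"]))
```

With noise-free parallel lines the residuals of the reference material vanish and those of the other material equal the vertical offset between the lines, which is the quantity the randomization test then examines.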

5.  EXAMPLES 

In  this  section  two  examples  are  discussed.  The  purpose  of  the  first  is  to  provide  a 
synopsis  of  how  the  randomization  test  is  performed.  In  that  example,  data  are 
characteristically  sparse.  The  purpose  of  the  second  is  to  illustrate  performance  when 
data  are  slightly  more  abundant  and  when  the  data  collection  does  not  exactly  follow 
the  template  discussed  earlier.  Data  for  both  examples  were  extracted  from  an 
unpublished  manuscript  provided  by  Mr.  Timothy  Farrand  of  the  Ballistic  Research 
Laboratory. 

5.1  Example 1.  Figures 4-5 display data arising from the firing of four penetrator
(material) types against semi-infinite steel blocks. All penetrators were manufactured
with a common geometry, mass of 65 g, and length-over-diameter ratio (L/D) equal to
15. The depleted uranium (DU) penetrators are separated according to Rockwell
hardness (Rc). The template for data collection given in Figure 2 was followed with
regard to target velocities. Deviations from the template include duplicate 97%-tungsten
results at 1500 m/s and no result for DU Rc=45 at 1100 m/s. Four data points
are the most recorded for any material. Data are listed in Table 1.

Figure 4.  Depth of Penetration for Four Materials, L/D = 15.

Figure 5.  The Relationship of Penetration Depth to the Fit $\hat{d}_{ij(t)}$.

Two tasks must be accomplished on the way to significance testing. The first step
is the estimation of $d_t(v)$. In this example, material t is 93% tungsten. Estimates for
the parameters $\beta_0$ and $\beta_t$ will result from regressing penetration depth on velocity and
the three indicator variables found in Equation 5. The values for the indicator variables
$m_{ij1}$, $m_{ij2}$, and $m_{ij3}$ are shown in Table 1. It follows that $\beta_1$, $\beta_2$, and $\beta_3$ represent
differences between 93% tungsten (our control) and 97% tungsten, DU Rc=49, and DU
Rc=45, respectively. The estimated penetration depths for material t are given by

$$\hat{d}_{ij(t)} = 73.7310 + 0.1035(v_{ij} - 1395),$$

which is graphed as $\hat{d}_{ij(t)}$ in Figure 5. (A slope of 0.1035 also well explains the effect of
velocity on penetration depth for the other three materials.) Residuals, $r_{ij(t)}$, are
computed as $d_{ij} - \hat{d}_{ij(t)}$. Table 2 lists the residual values for each material. In Figure 6
these residuals are plotted about the horizontal line, $r_{ij(t)} = 0$.
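
As a check on the arithmetic, the reported fit can be applied to the Table 1 values. The sketch below takes the intercept, slope, and mean velocity from the fitted equation above rather than re-estimating them, and it omits DU Rc=49, whose depths are only partly legible in Table 1. Variable and function names are illustrative assumptions.

```python
# Coefficients reported in the text for the control material (93% tungsten).
INTERCEPT = 73.7310   # adjusted mean depth (mm)
SLOPE = 0.1035        # common slope (mm per m/s)
V_BAR = 1395.0        # mean velocity over all firings (m/s)

# (velocity m/s, depth mm) pairs from Table 1.
SHOTS = {
    "97% tungsten": [(1098, 42.70), (1304, 66.80), (1489, 78.20), (1507, 89.70)],
    "DU Rc=45": [(1304, 78.99), (1482, 99.06), (1660, 116.33)],
    "93% tungsten": [(1086, 39.12), (1297, 65.02), (1500, 83.31), (1682, 105.92)],
}


def fitted_depth(velocity):
    """Estimated depth for the control material at a given velocity."""
    return INTERCEPT + SLOPE * (velocity - V_BAR)


def residuals(shots):
    """Residuals r_ij(t) = d_ij - fitted depth of the control line."""
    return [depth - fitted_depth(velocity) for velocity, depth in shots]


# The intercept should match the adjusted mean of the control material.
control = SHOTS["93% tungsten"]
d_bar_t = sum(d for _, d in control) / len(control)
v_bar_t = sum(v for v, _ in control) / len(control)
adjusted_mean = d_bar_t - SLOPE * (v_bar_t - V_BAR)

print("adjusted mean:", round(adjusted_mean, 4))
for material, shots in SHOTS.items():
    print(material, [round(r, 2) for r in residuals(shots)])
```

The residuals agree with Table 2 to within about 0.01 mm; the small discrepancies are consistent with the slope and intercept being quoted to four decimal places.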

To determine significance, the $r_{ij(t)}$ are permuted between the materials being
compared. Consider, for example, the two DU materials. Their difference is estimated
by $b_2 - b_3$ and takes on the value 2.514, the average of the residuals of DU Rc=49 less
the average of the residuals of DU Rc=45. The reference distribution for determining
significance is constructed by computing $b_2 - b_3$ for each possible combination of the
residuals. Figure 7 depicts one such combination where four residuals were reassigned.
In that instance $b_2 - b_3 = 1.535$, one of 7C4 = 35 reference distribution values
computed under the null hypothesis of no difference between the two DU penetrators.
Figure 8 displays the reference distribution in the form of a stem-plot. The observed
value for $b_2 - b_3$ is circled. There are six distribution values which equal or exceed in
absolute value $|b_2 - b_3|$ (denoted by bold type in Figure 8), hence a p-value of 6/35 or
0.171. (Of the two entries in Figure 8 having absolute value of 2.5, one is listed with a
superscript to indicate that it would actually appear smaller, in absolute value, than its
counterpart if more decimal places were listed.) Table 3 includes the results of each
pairwise  material  comparison.  A  difference  can  be  claimed  between  the  DU  materials 
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Table  1.  Data  Matrix  for  L/D  =  15 


djj  (mm) 

Vjj  (m/s) 

miji 

mij2 

mii3 

42.70 

1098 

1 

0 

0 

97%  tungsten 

66.80 

1304 

1 

0 

0 

78.20 

1489 

1 

0 

0 

89.70 

1507 

1 

0 

0 

58.42 

1067 

0 

1 

0 

DU  Rc=49 

85.34 

1314 

0 

1 

0 

1481 

0 

1 

0 

fta 

1654 

0 

1 

0 

78.99 

1304 

0 

0 

1 

DU  Rc=45 

99.06 

1482 

0 

0 

1 

116.33 

1660 

0 

0 

1 

39.12 

1086 

0 

0 

0 

93%  tungsten 

65.02 

1297 

0 

0 

0 

83.31 

1500 

0 

0 

0 

105.92 

1682 

0 

0 

0 

Table  2.  Residuals  Relative  to  the  tth  Material  for  L/D  =  15 


97%  tungsten 

DU  Rc=49 

DU  Rc=45 

93%  tungsten 

-0.29 

18.65 

14.68 

-2.63 

rij(t) 

2.49 

20.00 

16.32 

1.43 

-5.26 

18.46 

15.16 

-1.29 

4.38 

14.52 

2.49 

15 


Velocity  (m/s) 

Figure  6.  Residuals  Relative  to  the  t“  Material 
Figure 7.  One Possible Reallocation of Residuals Between Materials 2 and 3.

Figure 8.  Stem-plot Representation of the Distribution of the Test Statistic, b2 - b3.

Table 3.  Significance of the Differences Observed in Example 1

L/D = 15
                                      Randomization
Contrast     Estimate     # unusual   # permutations   p-value

β1             0.3298        60             70          0.857
β2            17.9032         2             70          0.029
β3            15.3890         1             35          0.029
β1 - β2      -17.5734         2             70          0.029
β2 - β3        2.5142         6             35          0.171
β1 - β3      -15.0592         1             35          0.029
with a probability of 0.171 of being wrong. In consideration of the data, all p-values
appear  reasonable  and  act  to  quantify  the  differences  observed. 
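
The b2 - b3 comparison can be retraced from the residuals in Table 2. The following is a minimal sketch, not the authors' program: the function name `randomization_test` and the small tolerance `tol` are assumptions introduced here for illustration. It pools the seven DU residuals, forms every split into groups of four and three, and counts how many differences in mean residual are at least as large in absolute value as the observed one. Because the Table 2 residuals are rounded to two decimals, the observed difference it computes (about 2.52) differs slightly from the 2.5142 reported in Table 3, while the count of six unusual values out of 35 agrees with the text.

```python
from itertools import combinations

# Residuals relative to the reference material, copied from Table 2.
du_rc49 = [18.65, 20.00, 18.46, 14.52]   # material 2
du_rc45 = [14.68, 16.32, 15.16]          # material 3


def mean(xs):
    return sum(xs) / len(xs)


def randomization_test(group_a, group_b, tol=1e-9):
    """Exact two-sided randomization test on the difference in mean residuals.

    Hypothetical helper for illustration; tol guards against floating-point
    ties when comparing a reallocation with the observed statistic.
    """
    pooled = group_a + group_b
    n_a = len(group_a)
    observed = mean(group_a) - mean(group_b)

    reference = []
    for idx in combinations(range(len(pooled)), n_a):
        chosen = set(idx)
        a = [pooled[i] for i in idx]
        b = [pooled[i] for i in range(len(pooled)) if i not in chosen]
        reference.append(mean(a) - mean(b))

    # Count reference values at least as extreme as the observed statistic.
    unusual = sum(abs(v) >= abs(observed) - tol for v in reference)
    return observed, unusual, len(reference), unusual / len(reference)


observed, unusual, total, p_value = randomization_test(du_rc49, du_rc45)
print(f"b2 - b3 = {observed:.4f}; {unusual} of {total} as extreme; p = {p_value:.3f}")
```

Enumerating all reallocations is practical here because the reference distribution has only 35 members; for much larger samples a random subset of reallocations is a common substitute, which this sketch does not attempt.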

5.2  Example  2.  A  second  data  set  is  displayed  in  Figure  9.  Three  65-g  penetrators 
were  tested,  each  with  L/D  =  10.  Unlike  in  the  previous  example,  data  were  not 
collected  strictly  according  to  the  template  in  Figure  2.  They  need  not  be  for  the 
randomization test to be valid. Also, the distinctions between groups do not appear as
great  as  in  Example  1.  It  is  in  this  situation  that  an  explicit  quantification  of  any 
differences  is  most  needed  because  it  becomes  even  less  clear  how  much  observed 
difference  is  real  and  how  much  is  attributable  to  chance  variations. 

Table 4 lists the results for all pairwise comparisons between materials. The
increased sample sizes over the previous example allow for a finer resolution in the
number of reference distribution values. There are 12,870 values comprising the
reference distribution for b1, the estimated difference between 97% tungsten and 93%
tungsten. The p-value for the randomization test is 0.192, meaning that, under the null
hypothesis, the probability is 0.192 of observing a value for b1 at least as unusual as
1.4050. Generally, such a p-value would not be considered significant, suggesting that
97% tungsten and 93% tungsten are performing similarly for L/D = 10 penetrators.

A second contrast, β1 - β2, signifying the difference between 97% tungsten and DU,
is  estimated  to  be  -4.5113.  It  is  not  clear  from  the  examination  of  Figure  9  that  this 
constitutes  a  real  difference  in  performance.  The  randomization  test,  however,  yields  a 
p-value  of  0.0040  and  provides  solid  justification  for  the  metallurgist’s  claim  that  97% 
tungsten  and  DU  materials  are  performing  differently.  A  difference  between  these 
materials  was  observed  by  Magness  (1990). 
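
The p-values in Table 4 are simply the ratio of unusual reference values to the total number of reallocations, and the totals are binomial coefficients. The short check below is a sketch under one stated assumption: the group sizes for Example 2 are not given in the text, so reading 12,870 as 16C8 and 6,435 as 15C7 (eight shots for each tungsten alloy and seven for DU) is an inference from the counts, not a reported fact. The Example 1 counts, 8C4 = 70 and 7C4 = 35, follow directly from the group sizes in Table 1.

```python
from math import comb

# (contrast, # unusual, # permutations, reported p-value), copied from Table 4.
table4 = [
    ("b1", 2472, 12870, 0.1921),
    ("b2", 3, 6435, 0.0005),
    ("b1 - b2", 26, 6435, 0.0040),
]

p_values = {}
for contrast, unusual, permutations, reported in table4:
    p_values[contrast] = unusual / permutations
    print(f"{contrast:8s} {unusual:5d}/{permutations:<6d} = {p_values[contrast]:.4f}"
          f"  (reported {reported:.4f})")

# Example 1: group sizes of 4 and 3 are read from Table 1.
example1_counts = (comb(8, 4), comb(7, 4))    # 70 and 35

# Example 2: ASSUMED group sizes (8, 8, 7), inferred from the counts only.
example2_counts = (comb(16, 8), comb(15, 7))  # 12870 and 6435
print(example1_counts, example2_counts)
```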

6.  CONCLUSION 

For  the  testing  of  1/4-scale  kinetic  energy  penetrators  against  semi-infinite  steel 
blocks,  the  technical  considerations  and  the  procedures  addressing  them  are  long 
established.  It  is  the  intent  of  this  effort  to  enhance  the  inferential  process  within  the 
presiding  experimental  structure.  Presently,  once  data  are  collected,  inferences 
principally  consist  of  an  engineering  judgment  as  to  the  meaning  of  an  observed  vertical 


*  No  discussion  in  this  report  is  devoted  to  controlling  the  error  rate  for  multiple  contrasts.  For  more  detail,  see  Kirk  (1982). 
[Plot omitted; horizontal axis: Velocity (m/s)]

Figure 9.  Depth of Penetration for Three Materials, L/D = 10.


Table 4.  Significance of the Differences Observed in Example 2

L/D = 10
                                      Randomization
Contrast     Estimate     # unusual   # permutations   p-value

β1             1.4050       2472          12870         0.1921
β2             5.9163          3           6435         0.0005
β1 - β2       -4.5113         26           6435         0.0040

gap  between  linear  functions  representing  the  penetration  performance  of  two 
materials.  The  initial  motivation  for  pursuing  this  problem  was  the  engineer’s  lament 
that,  occasionally,  when  his  judgment  was  questioned,  he  had  little  recourse  but  to  stand 
firm  on  his  opinion  forged  from  years  of  experience.  The  linear  functions  themselves 
are  usually  established  subjectively  and  are  considered  parallel  over  the  range  of 
1100  m/s  to  1700  m/s.  Such  subjectivity  does  bring  into  question  the  consistency  of  the 
assessment  process.  An  objective  method  for  fit,  such  as  least  squares,  is  seldom  used, 
and  then  not  in  such  a  way  as  to  incorporate  the  common  slopes  assumption.  Nor  need 
it be in all instances. Often, the differences are so great as to allow for the approximate
fitting of the linear functions with no loss to the outcome, but perhaps equally often
they are not great, as when only marginal improvements are made over an
historical (control) material.

In  summary,  the  report  identifies  the  experimental  situation  as  being  similar  to 
that  in  which  an  analysis  of  covariance  model  is  usually  employed  and  then  expresses 
the  linear  model  in  a  manner  conforming  to  how  practitioners  currently  view  the 
problem,  even  to  the  extent  of  automatically  incorporating  the  parallel  lines  assumption. 
The  report  then  explores  some  important  problems,  such  as  data  arising  from 
independent  studies,  in  implementation  of  the  classical  method  for  significance  testing 
and  recommends  an  alternative  to  surmount  these  problems  in  the  form  of  a 
randomization  test.  This  test  is  implemented  on  two  sets  of  real  data,  and  its 
application  in  the  context  of  those  data  is  demonstrated. 

The  approach  presented  is  an  attempt  at  developing  a  unifying  structure  within 
which  inferences  in  this  environment  can  be  made  both  quantifiable  and  consistent. 

The  recommended  procedure  combines  existing  techniques  such  as  least  squares  with  a 
new  application  of  a  randomization  test  in  determining  the  significance  of  observed 
material differences. With the support of this test, practitioners can make definitive
statements as to the statistical significance of material differences observed.
GLOSSARY


additive  -  No  terms  formed  as  products  of  other  terms  in  the  model  are  present. 

analysis  of  covariance  -  A  method  involving  removing  the  effect  of  an  explanatory 
variable  on  a  response,  leaving  residuals  to  be  analyzed  to  determine  treatment 
effects. 

Central  Limit  Theorem  -  Guarantees  approximate  normality  for  a  sum  of  responses 
based  on  large  samples. 

combination  -  Denoted  mCn  and  meaning  the  number  of  groups  of  n  items 
that  can  be  formed  from  m  items,  without  regard  to  order. 

contrast  -  A  linear  combination  of  means  in  which  the  coefficients  sum  to  zero. 

degrees  of  freedom  -  Refers  to  the  amount  of  information  yielded  by  the  data.  In 
general,  it  is  desirable  to  have  many  degrees  of  freedom  in  the  denominator  of 
the  error  term. 

factorial  design  -  An  efficient  method  for  collecting  data  characterized  by  the  gathering 
of  data  over  all  treatment  level  combinations. 

fixed  effect  -  A  treatment  whose  influence  on  the  response  is  to  shift  the  response 
mean  in  accordance  with  the  levels  of  the  treatment. 

linear  model  -  Refers  to  a  statistical  model  which  is  linear  with  respect  to  the 
coefficients  to  be  estimated. 

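p-value - The probability, computed under the null hypothesis, of observing a value
of the test statistic at least as unusual as the one obtained. A small p-value supports
the claim that a difference has been observed (i.e., that the alternative hypothesis is true).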

random  sampling  -  A  process  of  selecting  n  members  from  a  population  in  such  a 
way  that  each  n-member  group  has  an  equally  likely  chance  of  selection. 
reference distribution - Under the null hypothesis, the possible values that the test
statistic  might  take  on  and  the  frequency  with  which  they  are  taken  on. 

residual  -  The  departure  of  the  model  from  the  data,  measured  as  the  observed 
data  value  less  the  model  prediction. 

significance  -  Refers  to  the  magnitude  of  the  p-value. 

stem-plot  -  A  graphical  display  similar  to  a  histogram,  where  the  bars  are  formed 
by  listing  the  final  digit  of  all  values  with  the  same  leading  digit(s).  The 
leading  digit(s)  serve  to  locate  the  bar  on  the  axis. 
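
The stem-plot construction described above can be sketched in a few lines. This is an illustrative sketch only: the function name `stem_plot` is hypothetical, values are rounded to one decimal place so that the integer part is the stem and the tenths digit is the leaf, and, unlike a full stem-plot, stems with no leaves are omitted. The input is the set of 97% tungsten and 93% tungsten residuals from Table 2.

```python
from collections import defaultdict


def stem_plot(values):
    """Return stem-plot lines for values rounded to one decimal place.

    Hypothetical helper: the stem is the signed integer part and the leaf
    is the tenths digit. Empty stems are not printed in this sketch.
    """
    rows = defaultdict(list)
    for x in values:
        tenths = round(abs(x) * 10)        # magnitude in units of 0.1
        stem, leaf = divmod(tenths, 10)
        negative = x < 0 and tenths > 0    # keep "-0" distinct from "0"
        rows[(negative, stem)].append(leaf)

    def axis_position(key):
        # Negative stems sort below positive ones, with "-0" just below "0".
        negative, stem = key
        return -stem - 0.5 if negative else stem

    lines = []
    for key in sorted(rows, key=axis_position):
        negative, stem = key
        label = f"{'-' if negative else ''}{stem}"
        leaves = "".join(str(d) for d in sorted(rows[key]))
        lines.append(f"{label:>3} | {leaves}")
    return lines


# Residuals for 97% tungsten and 93% tungsten, copied from Table 2.
residuals = [-0.29, 2.49, -5.26, 4.38, -2.63, 1.43, -1.29, 2.49]
lines = stem_plot(residuals)
print("\n".join(lines))
```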